that, when this coupling exists, the formation of long-range contacts is forced by the 
previous formation of local contacts. The absence of a strong geometric coupling leads 
to kinetics that are more sensitive to the interaction energy parameters; in this case the 
formation of local contacts is not sufficient to promote the establishment of long-range 
ones when these are strongly penalized energetically, leading to longer folding times. 
I. INTRODUCTION 



In the last few years the idea that the native geometry governs the overall
folding kinetics of small (typically fewer than 100 amino acids), single-domain,
two-state proteins has attracted considerable attention and prompted several
new lines of research. A particularly important observation by Plaxco et al.
[10] revealed a strong correlation (r = 0.92) between the experimental folding
rates of 24 two-state folders and the so-called contact order parameter, CO,
which measures the average sequence separation of contacting residue pairs in
the native structure relative to the protein chain length. The connection
between the CO (and, in more general terms, the native geometry) and the
average range of amino acid interactions in the native fold has set new ground
for discussing a long-debated issue in the protein folding literature, namely
the roles of local (i.e. close in space and in sequence) and long-range (i.e.
close in space but distant in sequence) inter-residue interactions in the
folding dynamics. Results obtained within the scope of this debate agree that
long-range (LR) interactions play an important role in stabilizing the native
fold, but there is no consensus on their role in the folding kinetics. For
example, very early results obtained by Go and Taketomi [11] for a 49-residue
chain on a two-dimensional square lattice suggest that local interactions
accelerate both the folding and unfolding transitions. In Ref. [15] Unger and
Moult studied optimized heteropolymer sequences with chain length N = 27 on a
three-dimensional cubic lattice and concluded that increasing the strength of
local interactions increases the ability of sequences to fold. In a different
study [13], Doyle and co-workers found that, in the context of the Zwanzig
model, the rate of folding increases as the contribution of the local
interactions to the native state's energy increases. By contrast, results
obtained by Abkevich et al. for the Miyazawa-Jernigan lattice-polymer model
provided evidence that, under conditions where the native state is stable, a
36-residue sequence on a three-dimensional cubic lattice folds to a native
structure with mostly LR contacts two orders of magnitude faster than a
sequence folding to a native structure with predominantly local contacts.
Govindarajan and Goldstein used a lattice model in conjunction with techniques
drawn from the theory of spin glasses and found that optimal conditions for
folding are achieved when local interactions contribute little to the native
state's energy. More recently, Gromiha and Selvaraj have analysed the 'global'
contribution of LR interactions to the folding kinetics by introducing a new
geometrical parameter named long-range order (LRO) [17]. The LRO, which
measures the number of LR contacts in the native structure relative to the
protein chain length, was found to correlate as well as the CO with the folding
rates of 23 (out of the 24) two-state folders previously studied by Plaxco et
al. [10]. This observation emphasizes the relative importance of LR
interactions in protein folding kinetics.

In addition, it has been shown recently that the free energy landscapes of
single-domain, two-state folders are considerably smooth [22, 23]. This finding
led to a renewed interest in the Go model and other modified Go-type
interaction schemes since, as for simple proteins, their energy landscapes are
relatively smooth [28]. Indeed, these models do not take into account the
sequence's chemical composition and account only for attractive interactions
between native contacts, thereby eliminating possible energetic traps. As a
consequence, only geometric traps, resulting from the chain connectivity and
the geometry of the native fold, contribute to the landscape's ruggedness, and
thus Go-type models are said to be frustrated in a 'topological' sense. These
models are therefore particularly suited to investigating the role of the
native state's geometry in the folding kinetics of simple proteins.

Motivated by these results, we revisit the Go model to investigate the
dependence of the folding kinetics on LR (and local) interactions for different
native geometries. Our main finding is that, for Go-type lattice polymers with
N = 48 amino acids, the LR interactions play a crucial role in determining the
folding rates and, most importantly, this effect is strongly dependent on the
native state's geometry. Indeed, we have found that, depending on the native
geometry, the dispersion of the simulated folding times spans up to ≈ 3 orders
of magnitude when the relative strength of LR interactions varies from zero
(only local interactions contribute to the native state's energy) to one (only
LR interactions contribute to the native state's energy). We have also found
that, depending on the native state's geometry, the set-up of LR contacts may
be strongly associated with the previous formation of local contacts. The
existence of this geometric coupling between local and LR contacts explains why
the observed folding kinetics may depend rather weakly on the relative
energetic contributions of local and long-range interactions. In proteins where
this geometric coupling is stronger, the local contacts promote the
establishment of LR contacts even when the LR interactions are not
energetically favored.

The present article is organized as follows: Section 
II describes the model and methods. In Section III we 
present and discuss the results of the simulations and in 
Section IV we draw the conclusions. 



II. MODEL AND METHODS 

Protein chains with N = 48 amino acids are modelled as self-avoiding walks on a
three-dimensional infinite lattice. Amino acids are represented by beads of
uniform size and the peptide bond, which covalently connects amino acids along
the polypeptide chain, is represented by a stick of length equal to the lattice
spacing. In order to mimic protein movement we use the so-called 'kink-jump'
move set, including corner-flips, end and null moves as well as crankshafts
[24]. The Go potential is used to model amino acid interactions, which means
that, for a given target native structure, equal stabilizing energies (< 0) are
ascribed to all the native contacts, i.e. contacts between pairs of beads which
are present in the target, and neutral energies (= 0) are ascribed to
non-native contacts, i.e. contacts that are not present in the target
structure. The total energy of a conformation Γ = {r_i} is then given by the
contact Hamiltonian

    H(\{\vec{r}_i\}) = \sum_{i>j}^{N} B_{ij}\, \Delta(\vec{r}_i - \vec{r}_j),    (1)

where the contact function, Δ(r_i - r_j), is unity if beads i and j form a
native contact but are not covalently linked and is zero otherwise, and the
interaction energy parameter is B_ij = -ε.
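
As an illustration of the contact Hamiltonian of Equation (1), the following is a minimal Python sketch and not the code used for the simulations. It assumes integer bead coordinates on a cubic lattice with unit spacing; the function names `lattice_contacts` and `go_energy`, and the toy 4-bead chain, are hypothetical and introduced only for this example.

```python
from itertools import combinations

def lattice_contacts(coords):
    """Return the set of non-bonded bead pairs (i, j), i < j, that are
    nearest neighbours on the cubic lattice (unit lattice spacing)."""
    contacts = set()
    for i, j in combinations(range(len(coords)), 2):
        if j - i == 1:          # covalently linked beads do not count
            continue
        dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
        if dist == 1:           # Manhattan distance 1 = adjacent sites
            contacts.add((i, j))
    return contacts

def go_energy(coords, native_contacts, epsilon=1.0):
    """Go energy: each native contact present in the conformation
    contributes -epsilon; non-native contacts contribute 0."""
    return -epsilon * len(lattice_contacts(coords) & native_contacts)

# Toy example (assumption): a 4-bead chain folded into a square has the
# single contact (0, 3); the stretched chain has none.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
native = lattice_contacts(square)
print(go_energy(square, native, epsilon=0.5), go_energy(straight, native))
```

Taking the intersection with the native contact set is what makes non-native contacts energetically neutral, as the Go potential requires.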

The folding simulations follow the standard Monte Carlo (MC) Metropolis
algorithm. Each MC run starts from a randomly generated unfolded conformation
(typically with fewer than 10 native contacts) and the folding dynamics is
traced by following the evolution of the fraction of native contacts,
Q = q/Q_max, where Q_max = 57 (for chains of length N = 48) and q is the number
of native contacts at each MC step. The folding time, t, is given by the first
passage time (FPT), that is, the number of MC steps needed to reach Q = 1.0 for
the first time.
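
The two ingredients of this protocol, the Metropolis acceptance rule and the first passage time, can be sketched as follows. This is a hedged illustration only: it assumes energies and temperature in units where the Boltzmann constant is 1, and the helper names `metropolis_accept` and `first_passage_time` are hypothetical. The move set itself is not implemented here.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis rule: always accept moves that do not raise the energy,
    otherwise accept with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def first_passage_time(q_trace, q_max=57):
    """Index of the first MC step at which all q_max native contacts are
    formed (Q = 1.0); None if the run never reaches the native state."""
    for step, q in enumerate(q_trace):
        if q == q_max:
            return step
    return None

# Made-up trace of the number of native contacts along a short run.
trace = [5, 30, 57, 50, 57]
print(first_passage_time(trace))
```

Only the first visit to Q = 1.0 matters for the folding time, so later excursions away from the native state do not change the result.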

A. Native structures 

We consider three native structures, displaying different geometries as
measured by the contact order parameter. These structures were found by
homopolymer relaxation. In these MC simulations a homopolymeric chain (i.e., a
polymer chain composed of beads of a single chemical type) is launched, at
temperature T = 0.7, from a randomly generated conformation and relaxes, after
some MC steps, to the minimum energy conformation. At each MC step a local
random displacement of one or two beads, provided by the kink-jump move set, is
accepted or rejected in accordance with the Metropolis rule. For each
conformation the total energy is given by the contact Hamiltonian of Equation
(1), where Δ = 1 if beads are in contact but not covalently linked and is zero
otherwise. The pairwise interaction energy parameter is B_ij = -1.0. For
homopolymers of chain length N = 48 on a three-dimensional cubic lattice the
most stable conformation, evolving under the Hamiltonian of Equation (1), is a
cuboid with 57 contacts. Because this structure displays the maximum number of
contacts it is generally referred to as a maximally compact structure.
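
The figure of 57 contacts can be checked by simple counting. The sketch below assumes that the cuboid has dimensions 3 x 4 x 4 (the text only says "a cuboid", so the dimensions are an assumption) and that the chain visits every site of the block exactly once; the function name is hypothetical.

```python
def max_contacts_cuboid(a, b, c):
    """Number of non-bonded nearest-neighbour pairs for a chain that fills
    an a x b x c block of cubic-lattice sites (one bead per site)."""
    n_sites = a * b * c
    # Nearest-neighbour pairs in the block, counted along each axis.
    neighbour_pairs = (a - 1) * b * c + a * (b - 1) * c + a * b * (c - 1)
    covalent_bonds = n_sites - 1      # a linear chain of n beads has n - 1 bonds
    return neighbour_pairs - covalent_bonds

print(max_contacts_cuboid(3, 4, 4))   # 104 neighbour pairs - 47 bonds = 57
```

Under the same counting a 2 x 4 x 6 block of 48 sites would give only 53 contacts, which is why the more cube-like block is the compact one.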

In order to emphasize their different geometries, the low-CO (0.12) structure,
Γ1, the intermediate-CO (0.19) structure, Γ2, and the high-CO (0.26) structure,
Γ3, are represented in Figures 1(a)-(c) through their contact maps. The
corresponding three-dimensional structures are depicted in Figures 1(d)-(f).
The contact map, C, is an N × N matrix with entries C_ij = 1 if beads i and j
are in contact and zero otherwise.

FIG. 1: Contact map and three-dimensional representation of structures Γ1 (a, d), Γ2 (b, e) and Γ3 (c, f). In the contact maps
the black squares represent the long-range contacts and the white squares stand for the local contacts.

In Ref. [17] Gromiha and Selvaraj have found that for real two-state proteins
the amino acids which are close in space and separated by at least 10 to 15
amino acids in sequence are important determinants of folding rates. Motivated
by this finding we define a native contact between two beads i and j as a local
contact if the backbone separation |i - j| is such that |i - j| < 10 or as a
long-range (LR) contact if |i - j| > 10. In Figure 1 the black squares
represent the LR contacts while the white squares stand for the local contacts.
The LRO parameter is 0.48 for Γ1, 0.44 for Γ2 and 0.92 for Γ3. We stress that
the number of LR and local contacts is approximately the same in targets Γ1 and
Γ2. The average LR contact length is 17.1 for Γ1, 20.1 for Γ2, and 26.5 for Γ3.
Table I summarizes the geometric traits of the three target structures.
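
The geometric descriptors discussed above can be computed from a list of native contacts along the following lines. This is a sketch under stated assumptions: contacts are given as (i, j) bead-index pairs, a contact counts as long-range when |i - j| > 10 and as local otherwise, and the contact order is taken as the mean sequence separation divided by the chain length. The function name and the toy contact list are hypothetical and do not correspond to any of the three targets.

```python
def geometric_descriptors(native_contacts, n_beads, lr_cutoff=10):
    """Compute CO, LRO, Q_LR and the mean LR contact length from a list of
    native contacts given as (i, j) bead-index pairs."""
    seps = [abs(i - j) for i, j in native_contacts]
    lr_seps = [s for s in seps if s > lr_cutoff]
    n_total = len(seps)
    return {
        "CO": sum(seps) / (n_total * n_beads),   # relative contact order
        "LRO": len(lr_seps) / n_beads,           # LR contacts per residue
        "Q_LR": len(lr_seps) / n_total,          # fraction of LR contacts
        "mean_LR_length": sum(lr_seps) / len(lr_seps) if lr_seps else 0.0,
    }

# Toy contact list for a 20-bead chain (made up for illustration).
toy_contacts = [(0, 3), (1, 4), (0, 13), (2, 17)]
descriptors = geometric_descriptors(toy_contacts, n_beads=20)
print(descriptors)
```

With these definitions, 44 LR contacts in a 48-bead chain give LRO = 44/48 ≈ 0.92, the value quoted for the high-CO target.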



TABLE I: Contact order, fraction of long-range native contacts, Q_LR,
long-range order and average long-range contact length for structures Γ1, Γ2
and Γ3. Q_LR is the number of LR native contacts normalised to the total number
of native contacts.
III. NUMERICAL RESULTS 
A. Simulation temperature 

The folding kinetics depends on the temperature. Indeed, when the temperature
is very high, all conformations are equally foldable and the kinetics becomes
increasingly slower due to rapid interconversions (fluctuations) between
unfolded states (in the high-temperature regime the folding time approaches the
Levinthal time, i.e., it becomes exponential in the number of accessible
conformations). On the other hand, in the low-temperature regime, an
Arrhenius-like behaviour, characterized by trapping into metastable states, is
expected (as discussed in [18]). Therefore, for kinetically foldable proteins,
there must exist an intermediate temperature where the folding process is
fastest. The existence of this temperature, called the optimal folding
temperature, was reported in several studies for lattice models
(sequence-specific as well as Go models).

In the present study folding kinetics is studied at the optimal folding
temperature T*, that is, the temperature that minimises the folding time, t. In
order to determine T* we performed MC simulations over a broad temperature
range and ran a set of 100 MC simulations at each temperature, T. The folding
time was then taken as the mean FPT to the native structure averaged over the
100 MC runs.
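
The selection of T* described above amounts to averaging the first passage times at each temperature and taking the minimum. A minimal sketch follows; the function name is hypothetical and the numbers are made up for illustration (three runs per temperature instead of 100), not simulation results.

```python
from statistics import mean

def optimal_folding_temperature(fpt_by_temperature):
    """Return (T*, mean FPT at T*) given {temperature: [FPT of each run]}."""
    mean_fpt = {T: mean(runs) for T, runs in fpt_by_temperature.items()}
    t_star = min(mean_fpt, key=mean_fpt.get)
    return t_star, mean_fpt[t_star]

# Made-up first passage times, in MC steps.
toy_runs = {
    0.5: [9.0e6, 1.1e7, 1.0e7],
    0.7: [4.0e5, 6.0e5, 5.0e5],
    0.9: [2.0e6, 3.0e6, 2.5e6],
}
t_star, t_fold = optimal_folding_temperature(toy_runs)
print(t_star, t_fold)
```

In practice the temperature grid would need to be fine enough around the minimum for the estimate of T* to be meaningful.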

Figure 2 reports the dependence of the folding time on the folding temperature
for each structure and ε = 0.5. At the optimal folding temperature, the
kinetics is not dominated by kinetic traps, and folding to the native state
proceeds relatively fast.

We stress that, at T > T*, the observed dispersion of folding times is rather
small (5.56 ± 0.04 < log10(t) < 6.11 ± 0.05) and note that such behaviour is
typical of the Go and other lattice (as well as off-lattice) models.

The functional dependence of the folding time on the temperature is
qualitatively similar for the three structures in the high-T regime. Note that
in this regime one also observes a small dispersion of the folding times.
However, the reported results show that in the low-T regime the dependence of
the folding time on temperature is sensitive to the native state's geometry. In
particular we have not observed folding to Γ3 at low temperatures, T < 0.28.

FIG. 2: Dependence of the logarithmic folding time, log10(t), on the simulation temperature, T, for structures Γ1, Γ2 and Γ3.

The optimal folding temperature, on the other hand, appears to be a
geometry-independent parameter.



B. Folding kinetics for different range bias 

In this section we investigate the role of LR and local contacts in the
kinetics of protein folding by varying the relative contributions of LR and
local interactions to the total energy in the following way: the energy of a
conformation is given by

    H(\{\vec{r}_i\}) = \sigma H_{LR}(\{\vec{r}_i\}) + (1 - \sigma) H_{L}(\{\vec{r}_i\}),    (2)

where the terms H_LR and H_L determine the overall energy contribution of
long-range and local contacts to the conformation's energy and are given by

    H_{LR(L)}(\{\vec{r}_i\}) = -\sum_{i>j}^{N} \Delta_{LR(L)}(\vec{r}_i - \vec{r}_j),    (3)

where Δ_LR(L)(r_i - r_j) is unity if beads i and j form a LR (local) native
contact and is zero otherwise. The parameter σ, which we shall call the range
bias parameter, takes values in 0.0 ≤ σ ≤ 1.0. When σ is zero all LR
interactions are 'switched-off' and only local interactions contribute to the
conformation's total energy.

FIG. 3: Dependence of the native state's energy, E, on the range bias parameter σ for structures Γ1, Γ2 and Γ3.

The reverse situation is obtained when σ = 1.0, as in this case only LR
interactions contribute to the total energy. Conformation energies given by
Equation (2) imply that the native state's energy varies as a function of σ.
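
To make the dependence of the native state's energy on the range bias parameter concrete, the sketch below evaluates Equations (2) and (3) for a fully formed native structure. It assumes that each LR native contact contributes -σ and each local native contact -(1 - σ); the function name is hypothetical. The contact counts used in the example (44 LR contacts out of 57) are those quoted for the high-CO target in the concluding section.

```python
def native_energy(sigma, n_lr, n_local):
    """Native-state energy under the range-biased Hamiltonian: every LR
    native contact contributes -sigma, every local one -(1 - sigma)."""
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    return -(sigma * n_lr + (1.0 - sigma) * n_local)

# High-CO target: 44 LR and 57 - 44 = 13 local native contacts.
for sigma in (0.0, 0.5, 1.0):
    print(sigma, native_energy(sigma, n_lr=44, n_local=13))
```

At σ = 0.5 every contact carries the same weight, so the native energy is -0.5 × 57 = -28.5 whatever the split between LR and local contacts.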

Results plotted in Figure 3 illustrate the dependence of the native state's
energy on the range bias parameter for targets Γ1, Γ2 and Γ3. While targets Γ1
and Γ2 have predominantly local contacts, and thus their energy increases with
σ, for target Γ3 the lowest native state energy is observed when σ = 1.0. Since
the native state's energy depends on the range bias parameter we have
determined, for each σ, the corresponding optimal folding temperature, T*.

The dependence of the folding time on the range bias parameter is reported in
Figure 4(a) for the three native geometries. For σ < 0.20 (resp. σ < 0.10) we
have not observed folding to target Γ3 (resp. Γ2).

The behaviour exhibited by target Γ3 is easily explained: since approximately
80 per cent of Γ3's native contacts are LR there is little competition (and
therefore little frustration) between LR and local interactions. We stress
that, in the present model, the competition between local and LR contacts
results from the relative weight of the two types of interactions (which are
always stabilizing, i.e., < 0). The resulting frustration is therefore
different from that of sequence-specific models where the energy of the local
and of the LR pair interactions can be stabilizing (i.e., < 0) or
de-stabilizing (i.e., = 0 or > 0).

FIG. 4: (a) Dependence of the logarithmic folding time, log10(t), on the range bias parameter σ for structures Γ1, Γ2
and Γ3 with different native state's energies and (b) with a fixed native state's energy.

The slight decrease in the folding time observed for σ > 0.5 is driven by the
native state's energy (since its decrease with σ results in a driving force to
folding). However, the effect of decreasing σ below σ = 0.5 is equivalent to
that of progressively 'switching-off' the LR interactions and, in the limit of
σ = 0, it actually forces the structure to fold with only 20 per cent of its
native interactions. In this case the driving force to folding decreases
steadily, which results in longer folding times and eventually, for σ < 0.20,
folding failure. The observed threshold is smaller for target Γ2 because, by
contrast to the behaviour of target Γ3, the native state's energy decreases as
σ is lowered (for σ < 0.5) and this effect balances that of switching-off the
LR interactions.

The results obtained for the low- and intermediate-CO target structures, Γ1 and
Γ2, are more interesting. The corresponding curves are qualitatively similar
but a closer inspection reveals an important difference, namely: for σ < 0.5
the dependence of the folding time on σ is much stronger for the
intermediate-CO structure, Γ2. Indeed, in this case one observes a remarkable
three-order-of-magnitude dispersion of folding times, ranging from
log10(t_min) = 5.76 ± 0.05 (for σ = 0.65) to log10(t_max) = 8.75 ± 0.05 (for
σ = 0.10), by contrast with Γ1 for which log10(t_min) = 5.50 ± 0.08 (for
σ = 0.70) and log10(t_max) = 7.69 ± 0.09 (for σ = 0.00). We note that for both
structures the folding time increases considerably faster when σ decreases than
when σ increases away from the minimum. However, in the latter case, the
folding times do not deviate from each other, by contrast with their behaviour
for σ < 0.5. We stress that for both geometries successful folding is still
observed for σ = 1.0; this corresponds to 'switching-off' all local
interactions, which are more than half the total number of interactions in both
structures.

C. Folding kinetics for different range bias at fixed 
native energy 

In order to rule out differences in the folding dynamics driven by the
stability of the native state we now investigate the contribution of LR and
local interactions to the folding kinetics of structures Γ1, Γ2 and Γ3 in the
following way: the total energy of the native structure is kept fixed (and
equal to E = -28.5, which is equivalent to taking ε = 0.5 in Equation (1))
while the relative contributions of LR and local interactions are varied over
the whole range. We impose the fixed energy constraint by taking the total
energy given by Equation (2) and using for the long-range and local
Hamiltonians

    H_{LR(L)}(\{\vec{r}_i\}) = -\sum_{i>j}^{N} \epsilon(\sigma)\, \Delta_{LR(L)}(\vec{r}_i - \vec{r}_j),    (4)

with

    \epsilon(\sigma) = \frac{0.5}{(1 - \sigma) Q_L + \sigma (1 - Q_L)},    (5)

where Q_L is the number of local native contacts normalised to the total number
of native contacts. Again, the parameter σ determines the contribution of local
and LR contacts to the total energy. For σ = 0.0 (σ = 1.0) only local (LR)
contacts contribute to the total energy. However, ε(σ), which measures the
interaction energy of all native contacts in H_LR(L), varies with σ in order to
keep the total energy of the native conformation constant. Using Eqs. (2), (4)
and (5) the energy per native contact is given by ε_L = (1 - σ)ε(σ) if the
contact is local, while it is given by ε_LR = σε(σ) for LR contacts. Figure 5
shows the dependence of ε_LR and ε_L on the range bias parameter for structures
Γ1, Γ2 and Γ3.
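
The fixed-energy prescription can be checked numerically. The sketch below is an illustration under an explicit assumption: it takes ε(σ) = 0.5 / [(1 - σ)Q_L + σ(1 - Q_L)], where the numerator 0.5 is inferred from the requirement that the native energy equals -28.5 for Q_max = 57 native contacts. The function names are hypothetical, and the contact counts (13 local, 44 LR) are those of the high-CO target.

```python
def epsilon_fixed(sigma, q_local, eps0=0.5):
    """Sigma-dependent contact energy scale, chosen so that the native
    energy equals -eps0 * Q_max for every sigma (eps0 = 0.5 is assumed)."""
    return eps0 / ((1.0 - sigma) * q_local + sigma * (1.0 - q_local))

def contact_energies(sigma, q_local):
    """Return (eps_L, eps_LR), the energy per local and per LR contact."""
    eps = epsilon_fixed(sigma, q_local)
    return (1.0 - sigma) * eps, sigma * eps

q_max, n_local = 57, 13           # high-CO target: 44 of 57 contacts are LR
q_local = n_local / q_max
native_energies = []
for sigma in (0.2, 0.5, 0.8):
    eps_l, eps_lr = contact_energies(sigma, q_local)
    native = -(n_local * eps_l + (q_max - n_local) * eps_lr)
    native_energies.append(native)
    print(sigma, round(native, 6))
```

The native energy stays at -28.5 for every σ, while the split between ε_L and ε_LR shifts from local to LR contacts as σ grows.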

We have studied the equilibrium population of states in order to investigate
the native state's occupation probability at the optimal folding temperature as
well as its dependence on σ. To this end long simulations (lasting up to 10^8
MC steps) were performed in order to ensure that data were collected under
equilibrium conditions. The results from these simulations (for the three
targets) are reported in the histograms of the corresponding figure for
σ = 0.3, 0.5 and 0.7. The height of each bar in the histograms measures the
occupation probability, i.e., the number of molecules (normalised to the total
number of molecules collected in one run) with a fraction of native contacts Q.
In all the cases considered most molecules are in the native state (Q = 1.0).
However, when σ = 0.3, the native state of target Γ1 has a rather low
occupation probability (less than 0.5). As the stability of the native state is
estimated as being proportional to the probability of the chain being in the
native conformation, we conclude that, for σ = 0.3, the native state of target
Γ1 is not very stable. We note that target Γ3, which has the largest fraction
of long-range contacts, exhibits the highest native state occupation
probability for all three values of σ. This observation is in line with the
idea that long-range contacts have a dominant role in stabilizing the native
fold.
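
An occupation-probability histogram of the kind described above could be assembled from sampled numbers of native contacts as in the following sketch. The function name is hypothetical and the samples are made up for illustration; they are not simulation data.

```python
from collections import Counter

def q_occupancy(q_samples, q_max=57):
    """Map each sampled fraction of native contacts Q = q / q_max to its
    occupation probability (counts normalised by the number of samples)."""
    counts = Counter(q_samples)
    n = len(q_samples)
    return {q / q_max: c / n for q, c in sorted(counts.items())}

# Made-up numbers of native contacts sampled along an equilibrium run.
samples = [57, 57, 50, 57, 42, 57, 50, 57]
occupancy = q_occupancy(samples)
print(occupancy[1.0])   # probability of finding the chain in the native state
```

The entry at Q = 1.0 is the native state's occupation probability used in the text as a proxy for stability.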

Figure 4(b) shows the dependence of the folding time, t, on the range bias
parameter for the three targets at fixed native state's energy. The conclusions
drawn for the case of varying native state's energy hold equally well in the
fixed energy case. In particular, the reported results confirm the trend for
the dependence of Γ3's folding time on the range bias parameter. We should
stress, however, that in the present case folding failure is observed for
σ < 0.15, by contrast with the varying energy model where no successful folding
was observed for σ < 0.20. We ascribe this behaviour to the stabilizing (or
equivalently, to the lower) native state's energy which compensates the effect
of 'switching-off' the LR interactions. Hereafter we will restrict the
discussion to the results for structures Γ1 and Γ2. We note that the main
difference between the fixed and varying energy models is that for σ > 0.5 the
folding times are systematically longer (up to one order of magnitude for Γ2)
when the native state's energy is kept fixed. Recall from Figure 3 that when
the native state's energy is allowed to vary it increases with σ up to E = -23
and E = -20 for targets Γ1 and Γ2, respectively. The fixed native state's
energy E = -28.5 is lower than the varying native energies in the range
σ > 0.5. Should native energy play a significant role, the folding time's
dependence on σ for σ > 0.5 would be less pronounced for the fixed native
energy simulations, in sharp contrast with our findings. Instead, these are
consistent with the idea that the kinetics of folding is dominated by the
formation of LR contacts.

FIG. 5: Dependence of ε_LR (energy per long-range contact) and ε_L (energy per local contact) on the range bias
parameter σ for structures Γ1, Γ2 and Γ3 in the fixed energy model. Also shown (in light grey) is the dependence of
ε_LR and ε_L on σ in the case of the varying native energy model.

As shown in Figure 5, in Γ1 and Γ2 the energy bias favouring LR contacts for
σ > 0.5 is greater in the fixed energy case. This explains the differences in
the behaviour of the curves corresponding to these two targets in Figures 4(a)
and 4(b).

From the results reported in Figure 4 we conclude that, by comparison with
local contacts, LR contacts play a crucial role in driving the folding kinetics
of small Go-type lattice polymers. Moreover, the effect of LR contacts on the
kinetics is strongly dependent on the native state's geometry.



D. Native structure and the geometric coupling 
between long-range and local contacts 

In this section we investigate the differences between the folding processes of
targets Γ1 and Γ2 when σ is varied from zero to one in order to interpret the
behaviour observed in the previous section.

Recall from Section II that our 'reaction' coordinate is the fraction of native
contacts Q. In general, Q works as a thermodynamic reaction coordinate by
measuring closeness to the native structure in energetic terms only. It has
been argued that thermodynamic closeness does not necessarily imply kinetic
proximity to the native structure unless the energy landscape is considerably
smooth [28] (as happens to be the case in the present study). Indeed, under
such conditions one can take Q as a kinetic reaction coordinate so that it
actually defines how quickly a conformation can convert into the native
structure [29]. Thus in what follows we assume that Q measures the kinetic
progress towards the native state.

In the corresponding figure we have plotted, for targets Γ1 and Γ2 and for
three values of σ (namely, 0.5, 0.1 and 1.0), the dependence on Q of the
following kinetic quantities: the fraction of LR contacts, q_LR, the fraction
of local contacts, q_L, and the normalized logarithmic folding time, log10(t*).
Note that in this case the fractions of LR and local contacts are normalized to
the total number of LR and local native contacts, respectively, and not to the
total number of native contacts; therefore q = 1.0 when Q = 1.0 but in general
q ≠ Q, and this is why the adopted notation is different from that used in the
previous sections.
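
The normalisation of these kinetic quantities can be made explicit with a short sketch. It assumes that contacts are stored as sets of (i, j) pairs, that a contact is long-range when |i - j| > 10, and that the native structure has at least one contact of each kind; the function name and the toy contact sets are hypothetical.

```python
def contact_fractions(formed, native, lr_cutoff=10):
    """Return (Q, q_L, q_LR) for a conformation whose formed contacts are
    given as a set of (i, j) pairs; only native contacts are counted."""
    native_lr = {c for c in native if abs(c[0] - c[1]) > lr_cutoff}
    native_local = native - native_lr
    formed_native = formed & native
    q_total = len(formed_native) / len(native)
    q_local = len(formed_native & native_local) / len(native_local)
    q_lr = len(formed_native & native_lr) / len(native_lr)
    return q_total, q_local, q_lr

# Toy example: three local and two LR native contacts (made up).
native = {(0, 3), (1, 4), (2, 5), (0, 13), (2, 17)}
formed = {(0, 3), (1, 4), (0, 13), (6, 9)}      # (6, 9) is non-native
Q, q_L, q_LR = contact_fractions(formed, native)
print(Q, q_L, q_LR)
```

Because q_L and q_LR are normalised separately, both reach 1.0 only in the native state, while at intermediate Q they can differ widely from each other and from Q.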

We start by observing the unbiased case, that is σ = 0.5. The kinetics of local
contact formation is similar in both targets, with the fraction of local
contacts starting from a much larger value than that of LR ones. However,
long-range contacts form considerably earlier in Γ1 and, in this case, the
kinetics of LR contacts follows that of q_L. In both cases, the normalized
folding time is controlled by the formation of local contacts. For σ = 1.0
local interactions are switched off and this results in an effective
slowing-down of the corresponding kinetics in both targets. Note that, in Γ1,
q_LR grows extremely quickly, reaching ≈ 90 per cent relatively early in the
folding process (when Q = 0.58) when compared with the behaviour exhibited for
σ = 0.5 (90 per cent when Q = 0.92). However, this early boost in q_LR does not
lead to stable structure formation as it subsequently drops to q_LR = 0.84 (for
Q = 0.86) and is then forced to follow the kinetics of local contact formation.
For this value of σ, the folding time is controlled by the setting-up of LR
contacts in Γ2 and by that of local contacts in Γ1. Finally, when σ = 0.1, both
targets exhibit a similar dependence of q_L on Q but the kinetics of q_LR is
slowed down considerably in Γ2. Again, the folding time is controlled by q_L in
Γ1 and by q_LR in Γ2, but for Γ2 the setting-up of LR contacts is much slower
than in the previous cases.

We interpret these observations in the following way: in target Γ1 there is a
strong geometric coupling between the formation of local and LR contacts,
meaning that the establishment of LR contacts is promoted by the establishment
of local contacts. On the other hand, in target Γ2 there is no such coupling
and this results in an overall kinetics which is more sensitive to changes in
the interaction energy parameters. In particular, in Γ2, local contacts are not
capable of promoting the setting-up of LR contacts when the LR interactions are
highly penalized energetically (recall that we did not observe successful
folding to Γ2 when σ < 0.10).
IV. CONCLUSIONS AND FINAL REMARKS 

In this paper we have revisited the Go model in order to investigate, by means
of a new approach, the role of long-range (LR) and local interactions in the
folding kinetics. We have focused our analysis on lattice polymers with chain
length N = 48 since, like real two-state folders, these models exhibit
relatively smooth energy landscapes. We studied the changes in the folding
process induced by unbalancing the contributions of local and LR interactions
to the native state's energy. Our results strongly suggest that LR interactions
play a dominant role in the folding kinetics. Indeed we observe a decrease in
the folding rates, or equivalently, an increase in the folding time, which is
clearly more pronounced when the contribution of the LR interactions (relative
to that of the local interactions) to the native state's energy is
progressively decreased towards zero. We have found that this effect is
essentially independent of the native state's energy. By contrast, the kinetic
response to decreasing the relative contribution of the LR interactions is
strongly dependent on the target geometry. We have selected our target
geometries on the basis of differing contact order parameters. In the high-CO
target, Γ3, LR contacts are the vast majority (44 out of 57 native contacts)
and this results in a trivial kinetic response: when LR interactions are
strengthened relative to the local ones there are no significant changes in the
folding rates; on the other hand, a strong increase in the folding times
(eventually resulting in folding failure) arises when they are weakened. More
interesting behaviour occurs in the folding kinetics of the other two
structures, the low- and intermediate-CO targets, Γ1 and Γ2, respectively. In
both structures local contacts dominate and both exhibit a similar fraction of
local and LR contacts. However, in the intermediate-CO target the kinetics is
much more sensitive to the weakening of LR interactions; in fact in this case
we observed a remarkable three-order-of-magnitude dispersion of folding times,
although this is still two orders of magnitude smaller than the dispersion of
folding times of real two-state folders (≈ 5 orders of magnitude) [10].

The topomer search model (TSM) for protein folding 
(reviewed in [2]]]) considers that the folding time is deter- 
mined by the difusive search for the ensemble of unfolded 
structures that share a similar, global topology with the 
native state (the native topomers) j^. Achieving the 
native topomer corresponds to surmounting the rate lim- 
iting step in folding, which is followed by the zippering of 
specific local native contacts, a process that rapidly leads 
to the native structure. Thus, according to the TSM, the 
rate at which an unfolded protein diffuses between dis- 
tinct topologies is much slower than the rate at which 
local structural elements form. Recent results obtained 
through numerical simulations of the diffusion of Gaus- 
sian chains by Makarov et al. |3lll33| suggest, in the con- 



text of this model, that the logarithmic folding rate grows 
linearly with the number Nlr of LR contact pairs in the 
native structure, which define the topomer. Makarov et 
al. investigated wether the TSM correlates well with 
the folding rates of the 24 two-state folders previously 
studied by Plaxco et al. 10] . To determine Nlr for each 
protein the authors have considered as LR a native con- 
tact where the amino acids are separated by at least 12 
or more residues along the protein backbone. A consid- 
erably strong correlation (r = 0.88) was found between 
the logarithmic folding rates and Nlr HU, suggesting 
that the TSM is indeed a plausible model for two-state 
folding rates. Our results are in broad agreement with 
the TSM in the sense that, irrespective of target 
geometry, we have found that decreasing the relative 
weight of LR interactions leads to a more pronounced 
increase of the folding times than increasing it. However, we 
have also found evidence for a folding mechanism (on the 
lattice) that is different from that of the TSM. According 
to the TSM, the step that determines the folding rates is 
the formation of the LR contacts in the native topomer. 
After this step a rapid zippering of the local contacts oc- 
curs and the native structure forms. Our results show 
that, depending on native geometry, the formation of LR 
contacts may be more strongly coupled with the forma- 
tion of the local contacts. This is clearly illustrated by 
the behaviour of target T1 when σ = 0.1 (Figure 7). For 
this value of σ the LR contacts are strongly penalized 
energetically, and the folding time is controlled by local 
contact formation. This is not observed for target T2, 
where under the same conditions (σ = 0.1) the folding 
time is controlled by LR contact formation (Figure 7), in 
agreement with the TSM. Another result that supports 
the existence of coupling between local and LR contact 
formation in target T1 is the fact that, again for σ = 0.1, 
LR contact formation is much faster for target T1 than 
for target T2. We note that this particular aspect of the 
folding process in lattice models has not been explored 
in previous simulation efforts. 
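
As a rough illustration of the correlation analysis described above, the following minimal sketch counts the LR contacts of a structure using the sequence-separation cutoff of 12 residues quoted in the text, and computes a Pearson correlation coefficient between two data series. The function names and the toy contact list are hypothetical and are not taken from the original study; an actual analysis would require the native contact list and the measured folding rate of each protein.

```python
import math


def count_long_range(contacts, min_separation=12):
    """Count contacts (i, j) whose sequence separation |i - j| is at
    least `min_separation` residues (12 is the cutoff quoted in the text)."""
    return sum(1 for i, j in contacts if abs(i - j) >= min_separation)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy contact list (hypothetical, not a real protein).
toy_contacts = [(0, 3), (1, 4), (2, 14), (5, 20), (7, 19), (10, 13)]
n_lr = count_long_range(toy_contacts)  # 3 for this toy list
```

Given one N_LR value and one logarithmic folding rate per protein, `pearson_r` applied to the two lists would yield a coefficient of the kind reported above.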

In a recent study Micheletti et al. [34] introduced a 
novel method, the so-called 'geometrical variational prin- 
ciple', to investigate the role of native geometry in guid- 
ing the protein to the native fold. This study consisted 
in computing the number of structures that share a cer- 
tain structural similarity with a given native structure 
(the structural similarity between a structure and the 
fixed native fold is measured by the fraction of native 
contacts Q in that structure). The authors have called 
this measure the density of overlapping conformations 
(DOC). A crucial result from this study was the finding 
that the DOC of real natural folds is always much larger 
(at any value of Q) than that found in artificially gen- 
erated structures (with the same chain length and num- 
ber of contacts but differing in the fractions of local and 
non-local contacts). Moreover, the authors found that, 
for Q ~ 0.5, the DOC of real folds is very close to its 
maximum value and that this 'extremality' of the DOC 
is related to a high content of secondary-structure-like 
motifs (alpha-helices and beta-sheets). In a subsequent 
study Maritan et al. [35] applied a 'dynamical variational 
principle' (DVP) to search for rapid folders in 
conformational space. The authors found, in the context of a 
Go model on an fcc lattice, that decreasing folding times 
are associated with increasing secondary structure con- 
tent (in agreement with Micheletti's results) and with 
decreasing contact order. This finding shows that the 
application of the DVP to search for kinetically foldable 
proteins results in the emergence of structures with 
predominantly helical order (i.e., with a high content of 
local contacts). Since our results were obtained for a 
cubic lattice, a detailed comparison with Maritan's 
findings is not possible. However, in the contact map of 
T1, shown in panel (a) of the corresponding figure, one can 
clearly identify a pattern that resembles that of 
alpha-helices, namely the existence of thick bands parallel 
to the main diagonal. The fact that T1 exhibits the 
shortest folding times for all values of the LR interaction 
strength is in broad agreement with Maritan's results. 
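
To make the similarity measure Q and the idea of counting overlapping conformations more concrete, the sketch below computes the contact set of a lattice conformation, its fraction of native contacts Q, and a histogram of the number of native contacts shared by a sample of conformations. It is only a toy lattice analogue of the DOC under stated assumptions: contacts are taken to be pairs of beads that are not consecutive along the chain and occupy nearest-neighbour lattice sites, and the three six-bead conformations are hypothetical examples rather than structures from any of the cited studies.

```python
from collections import Counter


def lattice_contacts(coords):
    """Return the set of contacts (i, j), i < j: beads that are not bonded
    along the chain but sit on nearest-neighbour lattice sites."""
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            # Manhattan distance 1 means adjacent sites on a cubic lattice.
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                contacts.add((i, j))
    return contacts


def fraction_native(coords, native_contacts):
    """Fraction Q of native contacts present in the conformation `coords`."""
    shared = lattice_contacts(coords) & native_contacts
    return len(shared) / len(native_contacts)


def overlap_histogram(conformations, native_contacts):
    """Count conformations by their number of shared native contacts
    (Q is this number divided by the total number of native contacts)."""
    return Counter(
        len(lattice_contacts(c) & native_contacts) for c in conformations
    )


# Hypothetical six-bead examples on a cubic lattice (all in the z = 0 plane).
native = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0), (0, 1, 0)]
partial = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0), (1, 2, 0)]
straight = [(i, 0, 0) for i in range(6)]

native_contacts = lattice_contacts(native)
histogram = overlap_histogram([native, partial, straight], native_contacts)
```

Keying the histogram by the integer number of shared contacts, rather than by Q itself, avoids using floating-point values as dictionary keys.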

The existence of a geometric coupling between local 
and LR contacts may have implications for the 
understanding of protein evolution, in the sense that 
it provides a possible mechanism for the emergence of 
mutational robustness in proteins. A biological system 
is said to be robust to mutations if it continues to 
function after genetic changes in any of its parts. Native 
structures endowed with a mechanism of local-LR contact 
coupling are naturally more capable of adapting quickly to 
mechanisms of biomolecular variation (point mutations, 
insertions, deletions, etc.) that change the amino acid 
sequence (i.e., that change the set of amino acid 
interactions), in the following sense. If the geometric 
coupling between local and LR contacts exists, one expects 
the protein's foldability (the protein's ability to fold at 
a reasonably fast rate, which is an evolutionary advantage) 
to be less affected by changes in the way the amino acids 
interact: when the LR contacts become energetically 
penalized as a result of sequence changes, the 
establishment of local contacts acts as a driving force for 
the establishment of LR ones. 

Finally, it would be interesting to investigate the 
interplay between target geometry and favored native 
contact interactions in more realistic models, which 
reproduce not only the dispersion of folding times of real 
proteins but also other aspects observed in the folding of 
real two-state folders, such as the thermodynamic and 
kinetic cooperativities [36]. A simple model that is a step 
in this direction is that of Kaya and Chan, who used a 
modified Go-type potential, involving nonadditive multibody 
interactions, to study the folding dynamics of 27-mers on a 
cubic lattice. When applied to a pool of targets comprising 
97 native geometries, chosen on the basis of their CO 
parameters, Kaya and Chan's model yielded folding rates 
spanning more than 2.5 orders of magnitude. Furthermore, 
this model also exhibited thermodynamic cooperativity and 
linear chevron plots (i.e., kinetic cooperativity) similar 
to those observed in experiments with real proteins. 

FIG. 6: Population histograms for targets T1 (first row), T2 (second row) and T3 (third row) and σ = 0.3, 0.5 and 0.7 at the 
optimal folding temperature. The native state, corresponding to Q = 1.0, is the dominant state for all structures at all values 
of σ. Except for structure T1 at σ = 0.3, the native state's occupation probability is larger than 0.5. Target T3, with the largest 
fraction of long-range contacts, exhibits the largest native state occupation probability at all values of σ. This observation is in 
line with the idea that long-range contacts have a dominant role in stabilizing the native fold. 

FIG. 7: Dependence of the fractions of long-range (LR) and local contacts on the fraction of native contacts, Q, for targets 
T1 and T2, and σ = 0.5, 1.0 and 0.1. Note that q is q_LR (i.e., the fraction of LR contacts) for the black bars and q_L (i.e., the 
fraction of local contacts) for the white bars. q_LR (q_L) is the number of LR (local) contacts normalised to the total number of 
LR (local) contacts in each native structure. Also shown is the dependence of log10(t*) on Q, where t* is the folding time at each 
Q normalised to the folding time at Q = 1.0. 
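
The normalised quantities defined in the caption of FIG. 7 can be sketched as follows. This is a minimal illustration, not the analysis code used to produce the figure: the sequence-separation cutoff `lr_cutoff` that separates local from LR contacts is left as a parameter because its value is not stated in this section, the function names are hypothetical, and the contact sets and times in the example are made up. The sketch assumes that the native structure has at least one local and one LR contact.

```python
import math


def split_contacts(native_contacts, lr_cutoff):
    """Split native contacts into (local, LR) sets; a contact (i, j) is LR
    when |i - j| >= lr_cutoff (the cutoff value is an assumption)."""
    lr = {c for c in native_contacts if abs(c[0] - c[1]) >= lr_cutoff}
    local = set(native_contacts) - lr
    return local, lr


def contact_fractions(formed, native_contacts, lr_cutoff):
    """Return (q_L, q_LR): formed local (LR) native contacts normalised to
    the total number of local (LR) contacts in the native structure."""
    local, lr = split_contacts(native_contacts, lr_cutoff)
    q_local = len(formed & local) / len(local)
    q_lr = len(formed & lr) / len(lr)
    return q_local, q_lr


def log_normalised_time(t_at_q, t_native):
    """log10(t*), where t* is the time to reach a given Q divided by the
    time to reach Q = 1.0."""
    return math.log10(t_at_q / t_native)


# Hypothetical example: five native contacts, three of them formed.
native_contacts = {(0, 3), (1, 4), (2, 7), (0, 9), (3, 12)}
formed = {(0, 3), (1, 4), (2, 7)}
q_local, q_lr = contact_fractions(formed, native_contacts, lr_cutoff=5)
```

In this example both local contacts are formed while only one of the three LR contacts is, so q_L is larger than q_LR, which is the kind of imbalance the black and white bars of the figure are meant to expose.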